Conditional Random Fields, Planted Satisfaction, and Entropy Concentration
نویسندگان
چکیده
This paper studies a class of probabilistic models on graphs, where edge variables depend on incident node variables through a fixed probability kernel. The class includes planted constraint satisfaction problems (CSPs), as well as more general structures motivated by coding and community clustering problems. It is shown that under mild assumptions on the kernel, the conditional entropy of the node variables given the edge variables concentrates around a deterministic threshold. This implies in particular the concentration of the number of solutions in a broad class of planted CSPs, the existence of a threshold function for the disassortative stochastic block model, and the proof of a conjecture posed by [RKPSS10] on parity check codes. It also establishes new connections among coding, clustering and satisfiability. ∗ Department of Electrical Engineering and Program in Applied and Computational Mathematics, Princeton University, Email: [email protected] † Departments of Electrical Engineering and Statistics, Stanford University, Email: [email protected]
منابع مشابه
Conditional Random Fields, Planted Constraint Satisfaction and Entropy Concentration
This paper studies a class of probabilistic models on graphs, where edge variables depend on incident node variables through a fixed probability kernel. The class includes planted constraint satisfaction problems (CSPs), as well as more general structures motivated by coding and community clustering problems. It is shown that under mild assumptions on the kernel and for sparse random graphs, th...
متن کاملCombination of Machine Learning Methods for Optimum Chinese Word Segmentation
This article presents our recent work for participation in the Second International Chinese Word Segmentation Bakeoff. Our system performs two procedures: Out-ofvocabulary extraction and word segmentation. We compose three out-of-vocabulary extraction modules: Character-based tagging with different classifiers – maximum entropy, support vector machines, and conditional random fields. We also co...
متن کاملConditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
We present conditional random fields , a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum e...
متن کاملA Preferred Definition of Conditional Rényi Entropy
The Rényi entropy is a generalization of Shannon entropy to a one-parameter family of entropies. Tsallis entropy too is a generalization of Shannon entropy. The measure for Tsallis entropy is non-logarithmic. After the introduction of Shannon entropy , the conditional Shannon entropy was derived and its properties became known. Also, for Tsallis entropy, the conditional entropy was introduced a...
متن کاملShallow Parsing with Conditional Random Fields
Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random fiel...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1305.4274 شماره
صفحات -
تاریخ انتشار 2013